Gene expression analysis of HIV patients with and without M. tuberculosis co-infection

Group 3: Anne Skov-Johannessen s184330, Dea F. Skipper s184324, Helene B. L. Petersen s194699, Johanne B. Overgaard s194691 and Rebecca C. Grenov s184344

Introduction

Tuberculosis (TB) is the leading cause of death in HIV-infected individuals.

  • HIV infection weakens the immune system

  • Co-infection limits the sensitivity of TB diagnosis

Previous work used a support vector machine (SVM) to find a 251-gene signature

  • Genes involved in Immunological, Infectious and Inflammatory Disease

Our aim:

  • Identify genes that are significantly differentially expressed in HIV patients with TB co-infection

  • Compare with the 251-gene signature found with the SVM model

Reference: (Dawany, N.)

Methods

Data flow:

Keep it clean and tidy:

  • Select variables

  • Mutate variables

  • Handle key-variable

  • Handle replicates
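The four cleaning steps above could look roughly like the following sketch in plain Python. The field names (`ID`, `value`, `state`, `note`), the toy records, and the decision to average replicate measurements are assumptions made for illustration; the slide does not show the actual pipeline.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw records; field names and values are illustrative only.
raw = [
    {"ID": " S1 ", "gene": "GENE_A", "value": "8.0", "state": "HIV", "note": "x"},
    {"ID": "S1", "gene": "GENE_A", "value": "10.0", "state": "HIV", "note": "y"},
    {"ID": "S2", "gene": "GENE_A", "value": "12.0", "state": "HIV/TB", "note": "z"},
]

def tidy(records):
    """Return one row per (sample, gene) with cleaned, typed variables."""
    grouped = defaultdict(list)
    for rec in records:
        # Select variables: only ID, gene, value and state are used; "note" is dropped.
        # Handle key variable: strip whitespace so identical sample IDs match.
        sample_id = rec["ID"].strip()
        # Mutate variables: convert text to float and recode disease state as a boolean.
        expression = float(rec["value"])
        tb_coinfected = rec["state"] == "HIV/TB"
        grouped[(sample_id, rec["gene"], tb_coinfected)].append(expression)
    # Handle replicates: average repeated measurements (an assumed strategy).
    return [
        {"sample_id": s, "gene": g, "tb_coinfected": tb, "expression": mean(values)}
        for (s, g, tb), values in sorted(grouped.items())
    ]

tidy_rows = tidy(raw)
```

In this toy input the two `S1` records collapse into a single row, so `tidy_rows` holds one row per sample and gene.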

Methods

Normalization - minimize technical variability

Log transformation - stabilize variance, reduce skewness
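As a rough illustration of these two steps, the sketch below scales every sample to a common median and then applies a log2 transformation. The slide does not name the normalization method, so median scaling, the pseudocount of 1, and the sample names are all assumptions.

```python
import math
from statistics import median

def median_scale(samples):
    """Scale each sample so that its median equals the median of all sample medians."""
    medians = {name: median(values) for name, values in samples.items()}
    target = median(medians.values())
    return {
        name: [v * target / medians[name] for v in values]
        for name, values in samples.items()
    }

def log2_transform(samples, pseudocount=1.0):
    """Apply log2(x + pseudocount) to compress large values and reduce skewness."""
    return {
        name: [math.log2(v + pseudocount) for v in values]
        for name, values in samples.items()
    }

# Hypothetical intensities: S2 has the same profile as S1 at twice the intensity,
# which stands in for a purely technical difference between samples.
samples = {
    "S1": [10.0, 100.0, 1000.0],
    "S2": [20.0, 200.0, 2000.0],
}
normalized = median_scale(samples)
logged = log2_transform(normalized)
```

After scaling, the two toy samples have identical profiles, which is the sense in which normalization removes technical variability here.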

Results - PCA

Variance explained by the principal components

  • First PC explains 15% of the variance

  • 31 PCs needed to explain 90% of variance
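The variance figures above come from a calculation of the following kind. This is a minimal NumPy sketch on synthetic data, assuming a samples-by-genes matrix that is centered but not rescaled; the function names are hypothetical and the numbers it produces are unrelated to the 15% and 31 PCs reported on the slide.

```python
import numpy as np

def explained_variance_ratio(X):
    """Fraction of total variance explained by each principal component.

    X has one row per sample and one column per gene.
    """
    Xc = X - X.mean(axis=0)                   # center each gene
    s = np.linalg.svd(Xc, compute_uv=False)   # singular values, largest first
    variances = s ** 2
    return variances / variances.sum()

def n_components_for(ratio, threshold=0.90):
    """Smallest number of leading PCs whose cumulative variance reaches the threshold."""
    cumulative = np.cumsum(ratio)
    return int(np.searchsorted(cumulative, threshold) + 1)

# Synthetic stand-in for the expression matrix (40 samples, 200 genes).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 200))

ratio = explained_variance_ratio(X)
n90 = n_components_for(ratio)
```

`ratio[0]` corresponds to the share of variance explained by the first PC, and `n90` to the number of PCs needed to reach 90%.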

Results - PCA

Scatter plot of projected observations onto PC1 and PC2

  • Slight separation by disease state along PC1

  • No clear separation by gender
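A projection like the one in the scatter plot can be sketched as follows, again on synthetic data. The group sizes, the artificial shift added to ten genes, and the name `pca_scores` are assumptions chosen so that a group difference shows up on PC1; they do not reproduce the study data.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project centered samples onto the leading principal components."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

rng = np.random.default_rng(1)
n_per_group, n_genes = 20, 50
X = rng.normal(size=(2 * n_per_group, n_genes))
tb = np.array([False] * n_per_group + [True] * n_per_group)
X[tb, :10] += 3.0   # artificial shift in 10 genes for the co-infected group

scores = pca_scores(X)   # column 0 is PC1, column 1 is PC2
pc1_gap = abs(scores[tb, 0].mean() - scores[~tb, 0].mean())
```

Plotting `scores[:, 0]` against `scores[:, 1]` and coloring the points by `tb` (or by gender) gives the kind of scatter plot described above; `pc1_gap` is a crude numeric summary of how far apart the two groups sit on PC1.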

Results - Linear Regression

Forest plot

  • Most significant genes are downregulated
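A forest plot shows one effect estimate with a confidence interval per gene. The sketch below fits a simple least-squares regression of expression on disease state for each gene. The model form, the toy values, and the use of a normal approximation for the interval (instead of the t-distribution a statistics package would use) are assumptions, not the analysis behind the slide.

```python
import math
from statistics import NormalDist, mean

def slope_with_ci(x, y, level=0.95):
    """Least-squares slope of y on x with a normal-approximation confidence interval."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)          # standard error of the slope
    z = NormalDist().inv_cdf(0.5 + level / 2)    # about 1.96 for a 95% interval
    return slope, slope - z * se, slope + z * se

# Hypothetical data: 0 = HIV only, 1 = HIV with TB co-infection.
tb = [0, 0, 0, 0, 1, 1, 1, 1]
expression = {
    "GENE_A": [8.1, 7.9, 8.0, 8.2, 6.0, 6.1, 5.9, 6.2],
    "GENE_B": [5.0, 5.2, 4.9, 5.1, 5.1, 4.9, 5.0, 5.2],
}

# One (estimate, lower, upper) triple per gene: the rows of a forest plot.
forest = {gene: slope_with_ci(tb, y) for gene, y in expression.items()}
```

With a 0/1 predictor the slope is the difference in mean expression between the two groups, so a negative estimate whose interval excludes zero corresponds to a downregulated gene.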

Results - Linear Regression

Volcano plot

  • None of the significant genes are in the 251-gene tuberculosis signature
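The comparison behind this result can be sketched as a classification of genes followed by a set intersection. The thresholds (`alpha`, `min_abs_lfc`), the gene names, and the two-gene stand-in for the 251-gene signature are all hypothetical.

```python
import math

def volcano_classes(results, alpha=0.05, min_abs_lfc=1.0):
    """Label each gene 'up', 'down' or 'ns' from its (log2 fold change, p-value)."""
    labels = {}
    for gene, (lfc, p) in results.items():
        if p < alpha and abs(lfc) >= min_abs_lfc:
            labels[gene] = "up" if lfc > 0 else "down"
        else:
            labels[gene] = "ns"
    return labels

def volcano_coordinates(results):
    """Return the (x, y) position of each gene: log2 fold change and -log10(p)."""
    return {gene: (lfc, -math.log10(p)) for gene, (lfc, p) in results.items()}

# Hypothetical per-gene results: (log2 fold change, p-value).
results = {"GENE_A": (-2.0, 1e-6), "GENE_B": (0.1, 0.6), "GENE_C": (1.5, 0.001)}
# Hypothetical stand-in for the published signature gene set.
signature = {"GENE_X", "GENE_Y"}

labels = volcano_classes(results)
significant = {gene for gene, label in labels.items() if label != "ns"}
overlap = significant & signature
```

An empty `overlap` is the situation described in the bullet above: no significant gene is also a signature gene.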